HIV-associated vaginal microbiome and inflammation predict spontaneous preterm birth in Zambia

A Lactobacillus-deficient, anaerobe-rich vaginal microbiome has been associated with local inflammation and spontaneous preterm birth (sPTB), but few studies have assessed this association in the setting of HIV. We performed metagenomic sequencing and inflammatory marker assays on vaginal swabs collected in pregnancy. We grouped samples into 7 metagenomic clusters (mgClust) using the non-redundant VIRGO catalogue, and derived inflammatory scores by factor analysis. Of 221 participants, median Shannon diversity index (SDI) was highest in HIV+ with detectable viral load (1.31, IQR: 0.85–1.66; p < 0.001) and HIV+ with undetectable virus (1.17, IQR: 0.51–1.66; p = 0.01) compared to HIV− (0.74, IQR: 0.35–1.26). Inflammatory scores positively correlated with SDI (+ 0.66, 95%CI 0.28, 1.03; p = 0.001), highest among anaerobe-rich mgClust2–mgClust6. HIV was associated with predominance of anaerobe-rich mgClust5 (17% vs. 6%; p = 0.02) and mgClust6 (27% vs. 11%; p = 0.002). Relative abundance of a novel Gardnerella metagenomic subspecies > 50% predicted sPTB (RR 2.6; 95%CI: 1.1, 6.4) and was higher in HIV+ (23% vs. 10%; p = 0.001). A novel Gardnerella metagenomic subspecies more abundant in women with HIV predicted sPTB. The risk of sPTB among women with HIV may be mediated by the vaginal microbiome and inflammation, suggesting potential targets for prevention.


Scientific Reports | (2022) 12:8573 | https://doi.org/10.1038/s41598-022-12424-w | www.nature.com/scientificreports/

In previous analyses, we demonstrated that women with HIV, and particularly those who had not yet started ART at the time of conception, had higher vaginal bacterial diversity and vaginal inflammatory markers 12,13. In this follow-up analysis employing whole community metagenomic sequencing and parallel cytokine assays, we investigate whether characteristics of the vaginal microbiome and local immune response are correlated and predict sPTB in women with and without HIV. Metagenomes afford higher resolution characterization of the composition of the vaginal microbiome and clustering both on taxonomic (species) and functional (genes) attributes 14. Elucidating biological processes by which HIV infection may incite spontaneous preterm labor and delivery could inform the development of effective therapeutic interventions for the prevention of HIV-related prematurity and its consequences.

Methods
Study design. The Zambian Preterm Birth Prevention Study (ZAPPS; ClinicalTrials.gov Identifier: NCT02738892) is an ongoing prospective antenatal cohort at the Women and Newborn Hospital of the University Teaching Hospitals (UTH) in Lusaka. Full study procedures have been described previously 15. Briefly, ZAPPS participants are enrolled prior to 24 gestational weeks and receive comprehensive antenatal care, laboratory testing, biological specimen collection, and ultrasound to establish gestational age. All deliveries prior to 37 gestational weeks were classified as provider-initiated or spontaneous by an on-site obstetrician (JP). sPTB was defined as delivery prior to 37 gestational weeks preceded by spontaneous preterm labor or pre-labor spontaneous rupture of membranes. Preterm labor inductions and pre-labor cesarean deliveries were considered provider-initiated preterm births and excluded from this analysis.
We defined exposures as HIV serostatus at cohort enrollment (HIV− vs. HIV+) and as a 3-level variable that further distinguished HIV+ participants by viral load suppression (i.e., undetectable vs. detectable). We determined HIV infection at enrollment by screening all participants using the SD Bioline 3.0 test (SD Biostandard Diagnostics, India) and confirming positive cases with Determine HIV-1/2 Ag/Ab Combo test (Alere Inc., Waltham, MA). Viral load assays were conducted using the Abbott RealTime HIV-1 Assay (Abbott Molecular, Des Plaines, IL) 16-18.

Ethics. The University of Zambia Biomedical Research Ethics Committee and the University of North Carolina Institutional Review Board each granted approval to conduct the ZAPPS study and for protocol-related specimen testing. All research was performed in accordance with relevant guidelines and regulations. Participants in ZAPPS provided individual written informed consent before undergoing study procedures.
Specimen collection. In ZAPPS, mid-vaginal dry polyester swabs collected at enrollment and at subsequent antenatal visits are stored on-site at −80 °C. Initial specimen selection for vaginal microbiome and cytokine analysis has been explained in detail previously 12,13. Briefly, to make the best use of the available budget, we used a double sampling technique that selected for participation all eligible HIV+ participants and all HIV− participants who delivered spontaneously prior to 37 gestational weeks. Among HIV− women who delivered at term, we selected participants at random at a proportion determined by available resources. Finally, we also analyzed repeat specimens collected between 24 and 36 gestational weeks from a random subset of HIV+ participants. All staff who performed laboratory analyses were blinded to baseline characteristics and clinical outcomes.

DNA isolation, sequencing, and bioinformatic analysis. As described in detail previously 12, we performed whole genome shotgun (WGS) sequencing of bacterial DNA extracted from vaginal swabs. To summarize, genomic DNA was isolated from vaginal swab samples, processed with the Nextera XT DNA Library Preparation Kit (Illumina), purified with Agencourt AMPure XP Reagent, and sequenced on an Illumina HiSeq 4000 system.
Sequencing output from the Illumina HiSeq platform was converted to FASTQ format and demultiplexed using Illumina Bcl2Fastq 2.18.0.12 conversion software (Illumina). Quality of the demultiplexed sequencing reads was verified with FastQC software (Babraham Institute, Cambridge, UK). Sequencing reads originating from the host were removed using BMTagger (v3.101) 19 and the human reference genome GRCh38/hg38 20. The data were then further processed using SortMeRNA (v2.1b) to remove ribosomal RNA sequencing reads 21 and then Trimmomatic (v0.3653) for quality filtering/trimming 22. The remaining reads were then mapped to the VIRGO non-redundant gene catalog using Bowtie 23 (v1.2.2) as described previously 14. Gene length-corrected mapping results from VIRGO were used to establish the taxonomic composition of the vaginal microbiota 24 (Supplemental Table). Within-species diversity of these microbiota was established for only the four most prevalent bacteria and only included samples with at least 90% of the genes of the taxon's genome: L. crispatus (n = 18), L. iners (n = 115), Gardnerella (n = 152), and A. vaginae (n = 85). Briefly, the patterns of gene content for each taxon were subjected to hierarchical clustering using Bray-Curtis dissimilarity and Ward linkage. For each taxon, two clusters were identified, each representing a group of genes (i.e., a set of strains) co-occurring frequently in this dataset. These clusters were previously coined metagenomic subspecies 14. Because only Gardnerella metagenomic subspecies demonstrated differences in incidence of sPTB, only this taxon's metagenomic subspecies were integrated in hierarchical clustering of samples using Bray-Curtis dissimilarity and Ward linkage, where the total abundance of Gardnerella in each sample was assigned to either metagenomic subspecies Gardnerella type 1 or Gardnerella type 2.
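As an illustration of the dissimilarity measure named in this section, the following Python sketch computes the Bray-Curtis dissimilarity between two abundance profiles. It is a minimal example only: the function name `bray_curtis` and the two sample vectors are hypothetical, and it does not reproduce the Ward-linkage clustering step or the actual pipeline used in the study.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors.

    Defined as sum(|u_i - v_i|) / sum(u_i + v_i); 0 means identical
    profiles and 1 means no shared taxa.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Hypothetical relative abundances of three taxa in two samples
sample_a = [0.80, 0.15, 0.05]
sample_b = [0.10, 0.60, 0.30]
print(bray_curtis(sample_a, sample_b))
```

A full analysis would compute this value for every pair of samples and pass the resulting distance matrix to a hierarchical clustering routine.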
For samples that did not meet our threshold for Gardnerella genome coverage to be included into the metagenomic subspecies analysis, the Gardnerella relative abundance was assigned to "Gardnerella other". Considering fluctuating taxonomy within the Gardnerella genus and for clarity of comparisons between metagenomic subspecies identified in our analysis and species of other genera, herein we refer to these three Gardnerella metagenomic types as "metagenomic subspecies". Statistical support for seven within-study metagenomic clusters was found using silhouette scores. Species level composition for Gardnerella was determined for each sample using VIRGO 25. Shannon diversity index (SDI), a measure of species richness and evenness (alpha diversity), was calculated for each sample as the negative sum, over all species, of each species' proportional abundance multiplied by the natural logarithm of that proportion 26.
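The SDI definition above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function name `shannon_diversity` and the example abundances are assumptions, and zero-abundance species are skipped because their contribution to the sum is taken as zero.

```python
import math


def shannon_diversity(abundances):
    """Shannon diversity index H = -sum(p_i * ln(p_i)) over species.

    `abundances` may be raw counts or relative abundances; they are
    normalized to proportions before the index is computed.
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical profiles: one dominated by a single species, one even mix
dominated = [0.97, 0.01, 0.01, 0.01]
even_mix = [0.25, 0.25, 0.25, 0.25]
print(shannon_diversity(dominated), shannon_diversity(even_mix))
```

A community dominated by one species gives a value near zero, while an even mix of four species gives ln(4), which is the maximum for four species.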

Statistical analysis.
We analyzed baseline demographic data of the sub-study cohort, calculating median and interquartile range (IQR) for each continuous variable, and frequency and percentage for each categorical variable. We compared categorical and continuous variables between primary outcome groups using chi-square and linear regression, applying probability weighting to account for sampling technique. We described overall and relative taxonomic abundances, species diversity, individual inflammatory biomarkers, and inflammatory scores, quantifying their associations with HIV and viral suppression in unadjusted models. Inverse probability sampling weights were applied to account for unequal sampling technique, with robust standard errors computed by the linear variance estimator 27.
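To make the idea of inverse probability sampling weights concrete, the sketch below estimates a weighted prevalence in Python. It is an assumption-laden toy example: the function name `weighted_prevalence`, the record layout, and the sampling probability of 0.25 are all hypothetical and are not taken from the study, and robust variance estimation is not shown.

```python
def weighted_prevalence(records):
    """Inverse-probability-weighted prevalence of a binary trait.

    `records` is an iterable of (has_trait, sampling_probability) pairs.
    Each sampled participant is weighted by 1 / sampling_probability so
    that under-sampled groups count proportionally more.
    """
    numerator = 0.0
    denominator = 0.0
    for has_trait, probability in records:
        if not 0 < probability <= 1:
            raise ValueError("sampling probability must be in (0, 1]")
        weight = 1.0 / probability
        denominator += weight
        if has_trait:
            numerator += weight
    if denominator == 0:
        raise ValueError("no records supplied")
    return numerator / denominator


# Hypothetical data: four participants sampled with certainty and two
# sampled with probability 0.25 (each of those two stands in for four people)
records = [
    (True, 1.0), (True, 1.0), (False, 1.0), (False, 1.0),
    (False, 0.25), (False, 0.25),
]
print(weighted_prevalence(records))
```

In this toy data the unweighted prevalence is 2/6, whereas the weighted estimate is 2/12 because the two under-sampled participants each carry a weight of 4.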
Correlations between relative abundances of key bacterial taxa and inflammatory markers were estimated using Spearman's rank-order correlation coefficient (rho). A weighted linear regression model was built to estimate the association between individual inflammatory markers and SDI among samples collected between 16-20 gestational weeks, adjusting for HIV and detectable viral load at baseline. All inflammatory markers assayed were included in the initial model and then sequentially removed by stepwise backward elimination.
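For readers unfamiliar with Spearman's rho, the following Python sketch shows one common way to compute it: rank both variables (averaging ranks across ties) and take the Pearson correlation of the ranks. The function names and example values are hypothetical, and the sketch omits p-values and the multiple-comparison correction described for Table 2.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical relative abundances and log cytokine concentrations
abundance = [0.02, 0.10, 0.35, 0.60]
log_cytokine = [1.1, 1.9, 2.4, 3.8]
print(spearman_rho(abundance, log_cytokine))
```

Because only ranks are used, any monotonic transformation of either variable (such as the log transformation of cytokine concentrations) leaves rho unchanged.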
We calculated median SDI and inflammatory scores of baseline samples in each metagenomic cluster and estimated the relationship of each compared to L. crispatus-dominated mgClust7 using weighted linear regression. Similarly, SDI and inflammatory scores were calculated for baseline samples among each Gardnerella metagenomic subspecies classification. We reported linear associations of Shannon diversity and vaginal inflammation in samples with Gardnerella type 1 and type 2 compared to the "other" Gardnerella subtype overall and among subgroups of HIV serostatus.
We identified optimal cutpoints for dichotomizing mean relative abundances of key taxa that maximized the product of sensitivity and specificity to predict the outcome of sPTB using the method proposed by Liu 28. Associations of vaginal microbiota and inflammation with sPTB were calculated as unadjusted and adjusted prevalence ratios using Poisson regression with robust error variance 29,30. Because twin gestation and short cervical length (< 2.5 cm) are strong independent predictors of sPTB and rare in our cohort, we excluded women with these risk factors from analyses of this outcome. Multivariable models were adjusted for potential confounding due to maternal age, body mass index (BMI), parity, prior PTB, as well as HIV serostatus (in baseline analyses) or detectable viral load (for matched repeat analyses).
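The cutpoint criterion described above, maximizing the product of sensitivity and specificity, can be sketched as a simple search over observed values. The Python below is illustrative only: the function name `liu_cutpoint`, the rule that values at or above the cutpoint count as test-positive, and the example data are assumptions, not details from the study.

```python
def liu_cutpoint(values, outcomes):
    """Find the cutpoint maximizing sensitivity * specificity.

    A value >= cutpoint is treated as test-positive. Returns a tuple
    (cutpoint, sensitivity, specificity); ties keep the lowest cutpoint.
    """
    n_pos = sum(1 for o in outcomes if o)
    n_neg = len(outcomes) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative outcome")
    best = None
    for cut in sorted(set(values)):
        true_pos = sum(1 for v, o in zip(values, outcomes) if o and v >= cut)
        true_neg = sum(1 for v, o in zip(values, outcomes) if not o and v < cut)
        sensitivity = true_pos / n_pos
        specificity = true_neg / n_neg
        if best is None or sensitivity * specificity > best[1] * best[2]:
            best = (cut, sensitivity, specificity)
    return best


# Hypothetical relative abundances and outcome indicators (True = sPTB)
values = [0.05, 0.10, 0.20, 0.55, 0.60, 0.70]
outcomes = [False, False, False, True, False, True]
print(liu_cutpoint(values, outcomes))
```

With these toy values the search selects 0.55, where sensitivity is 1.0 and specificity is 0.75; real data would typically be evaluated with a weighted analysis rather than raw counts.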

Results
Two distinct metagenomic subspecies within the Gardnerella genus were identified in our samples. The main difference between these metagenomic subspecies involved two recently described and closely related Gardnerella species: G. swidsinskii and G. leopoldii 25. The first Gardnerella subspecies was characterized by a higher proportion of G. swidsinskii/G. leopoldii while the second contained a more diverse array of Gardnerella spp. and a low proportion of G. swidsinskii/G. leopoldii (mean proportion of G. swidsinskii/G. leopoldii was 48% vs 4%, respectively; Fig. 2). A third profile contained a lower overall relative abundance of Gardnerella such that it precluded assignment to a metagenomic subspecies. We refer to these metagenomic subspecies as Gardnerella type 1, type 2, and "other", respectively. Based on clustering of all samples, 7 major metagenomic clusters (mgClust) were identified (Fig. 3): mgClust1 was dominated by L. iners (mean relative abundance 89%); mgClust2 by a mix of L. iners (54%) and Gardnerella "other" (26%); mgClust3 by Gardnerella type 1 (70%); mgClust4 by P. bivia, A. vaginae, and Gardnerella (a mix of all three metagenomic subspecies); mgClust5 by L. iners, Gardnerella type 2, and Ca. Lachnocurva vaginae; and mgClust6 by Gardnerella type 2, L. iners, and P. bivia. Finally, mgClust7 comprised predominantly L. crispatus (86%) with minor co-occurrence of other lactobacilli.
The relative distribution of vaginal specimens across metagenomic clusters at baseline also differed by HIV serostatus and viral load (Fig. 4). Compared to participants without HIV, women with HIV overall had higher prevalence of microbiota dominated by Gardnerella type 2 and other mixed anaerobes in mgClust5 (17% vs. 6%; p = 0.02) and mgClust6 (27% vs. 11%; p = 0.002), and markedly lower prevalence of the L. crispatus-dominant mgClust7 (4% vs. 23%; p = 0.001). While women with HIV had modestly higher prevalence of mgClust4

Vaginal inflammation and microbiome characteristics. In vaginal specimens collected at 16-20 weeks, moderate positive correlations were noted between log-transformed concentrations of IL-1β, IL-10, and sCD14 and relative abundances of Gardnerella type 2, A. vaginae, and P. bivia, while negative correlations were found with L. crispatus and Gardnerella "other" (Table 2). Conversely, a moderate negative correlation existed between SLPI and relative abundance of Gardnerella type 2 and A. vaginae, and positive correlations with Gardnerella "other" and L. crispatus. IL-1β, IL-10, and sCD14 concentrations were associated with higher SDI, while IL-2, IL-6, IL12p70, and SLPI were each modestly associated with lower SDI (Table 3). Inflammatory scores increased with SDI (coefficient +0.66, 95%CI 0.28, 1.03; p = 0.001); both were highest among specimens in mgClust2, mgClust4, mgClust5, and mgClust6 (Fig. 5), and were moderately correlated overall (rho +0.3; p < 0.001). Compared to L. crispatus-dominated mgClust7, both SDI and inflammatory scores were significantly higher in specimens of anaerobe-abundant mgClust2 through mgClust6, but only modestly higher in specimens of L. iners-dominated mgClust1 (Table 4). Both SDI and inflammatory scores were lowest in samples with Gardnerella "other" and significantly higher in those with Gardnerella type 2 (Table 5). The highest SDI and inflammatory scores were noted in samples collected from participants with Gardnerella type 2, regardless of HIV status.
Baseline vaginal inflammatory scores were higher among participants who experienced sPTB (median 0.90, IQR: 0.03-1.28) compared to those who delivered at term (median 0.50, IQR: −0.73-1.04) (Table 6). The prevalence of sPTB increased with higher vaginal inflammatory scores at baseline in models weighted for sampling and adjusted for maternal age, BMI, parity, prior PTB, and HIV serostatus (APR 2.8, 95% CI: 1.5, 5.2; p = 0.001). Baseline SDI was similar between those who experienced sPTB and those who delivered at term.

Nine (14%) participants with repeat vaginal samples analyzed delivered spontaneously before term. SDI increased between baseline and repeat collection timepoints among women who went on to have sPTB (median +0.18, IQR: 0.11-0.21) while it decreased among those who delivered at term (median −0.11, IQR: −0.54-0.37). In adjusted models, increasing SDI predicted sPTB (APR 2.5; 95%CI 1.1, 5.6). Vaginal inflammatory scores increased between baseline and repeat specimens modestly more among participants who experienced sPTB (median +0.37, IQR: −0.07-0.48) compared to those who delivered at term (median −0.11; IQR: −0.81-0.72), but confidence intervals were wide and included the null in weighted multivariable models.

Discussion
We employed metagenomic sequencing of the vaginal microbiome and assays of local inflammatory markers to investigate the relationships between the microbiome and sPTB in a cohort of pregnant women with and without HIV in Zambia. Our analysis confirms a high prevalence of diverse, anaerobe-rich microbiota in our cohort overall, and particularly among women with HIV. In this analysis, vaginal samples were classified into 7 distinct types based on clustering metagenomic content, whose distribution varied by maternal HIV and, in some cases, by viral suppression status. Two groups dominated by Gardnerella and L. iners species predicted sPTB. We identified two Gardnerella metagenomic subspecies (a group of co-occurring strains/species of Gardnerella defined by gene content) whose prevalence varied by maternal HIV serostatus and viral suppression, were associated with varying levels of microbial diversity and vaginal inflammation, and differentially predicted sPTB. These Gardnerella metagenomic subspecies were distinguished by the proportion of G. swidsinskii/G. leopoldii, two closely related Gardnerella species 25. Gardnerella metagenomic subspecies type 1, dominated by G. swidsinskii/G. leopoldii, was modestly less abundant both in women with HIV and those who experienced sPTB. In contrast, Gardnerella metagenomic subspecies type 2, notable for a highly diverse mix of other subspecies of Gardnerella, was more abundant among women with HIV and those who experienced sPTB. Vaginal microbiota dominated by L. crispatus, although very uncommon among women with HIV and rare overall, did not confer the anticipated protective effect against preterm birth as presented in other cohorts 32.
Finally, we described multiple correlations between vaginal microbiome and inflammatory markers and found that both vaginal inflammation among all participants at baseline and an increase in microbial diversity through pregnancy among a subset with HIV were associated with sPTB.

Table 2. Correlation between key relative bacterial abundances and log cytokine concentrations, represented as Spearman rank-order coefficients (rho). Bolded coefficients represent moderate correlation (i.e., |rho| between 0.3 and 0.5) between bacterial abundance and cytokine concentration and with p < 0.001, adjusted for multiple comparisons by Bonferroni correction.

Similar to other studies during and outside of pregnancy, pregnant women with HIV in ZAPPS exhibited more diverse and anaerobe-rich vaginal microbiota compared to women without HIV. However, a preponderance of literature in sub-Saharan Africa demonstrates that it is Lactobacillus-deficient, anaerobe-rich vaginal microbiota that confers a higher susceptibility to HIV infection itself, such that it remains unclear whether HIV is a cause or effect of increased microbial diversity and vaginal inflammation in pregnancy. The uncertain causal relationship between HIV infection and the vaginal microbiome is further complicated by broader shifts towards Lactobacillus dominance mediated by a physiologic estrogen excess during pregnancy, which may be disrupted by HIV-related chronic inflammation, immune reactivation with ART initiation, or certain antiretroviral agents. Although our analysis was limited in size, women who had started ART prior to conception had a reduction in vaginal inflammation throughout pregnancy compared to those who had not 13, while the change in alpha diversity (SDI) trended in the opposite direction.
This may indicate that ART initiation activates local inflammatory pathways with either minimal or modest benefit on the microbial milieu, but longitudinal analyses of vaginal microbiota from preconception through pregnancy, and from pre- and post-ART initiation, are needed to confirm this hypothesis. Furthermore, whereas nearly all participants with HIV in ZAPPS were taking efavirenz-based regimens, dolutegravir-based ART is now recommended as first-line therapy instead, such that future research will need to address any differential effects between these exposures. In previous reports derived from the same ZAPPS cohort, we described associations between HIV and sPTB 33, HIV and anaerobe-rich microbiota 12, ART initiation and vaginal inflammation 13, and vaginal inflammation among women who experienced sPTB 13. In this analysis that employed updated metagenomic sequencing, linked the vaginal microbiome to inflammation, and analyzed associations between the microbiome and birth outcomes, we found correlations between vaginal inflammatory markers and microbiome and demonstrated that metagenomic characteristics more common among women with HIV were associated with inflammation and sPTB. Using 16S rRNA gene sequencing to characterize the vaginal microbiota, Gudza-Mugabe and colleagues reported that, whereas pregnant women living with HIV in Zimbabwe had higher prevalence of diverse, anaerobe-rich microbiota, and moderate correlations were noted between certain bacterial taxa and vaginal cytokines, no association was found between HIV and inflammation, and the higher risk of preterm birth among women with HIV was independent of the vaginal microbiota. In contrast, we found characteristics of the vaginal microbiome and inflammation predicted sPTB even after adjusting for HIV serostatus and viral suppression.
Methodological differences may limit direct comparison between the Gudza-Mugabe report and our study, including methods of estimating gestational age, gestational ages at assessment, distinctions in viral load suppression and ART initiation timing, and the higher resolution in Gardnerella speciation by metagenomic sequencing. Furthermore, HIV differentially increases the risk of spontaneous over provider-initiated preterm deliveries such that risk estimates and associations may be blunted when examining PTB overall 33 ; this may partly explain the null results reported by others.
The direct causes of sPTB are often unknown, but overt infection leading to inflammation and immune activation are common antecedents. In concert with findings among pregnant and non-pregnant women 6,34-37, we found moderate correlations between vaginal inflammation and the composition of the vaginal microbiome at baseline, and independent associations between the vaginal microbiome and risk of sPTB. However, over half of our participants had an anaerobe-rich, Lactobacillus-deficient type of vaginal microbiota, and a similar proportion experienced sPTB as those with microbiota dominated by L. crispatus, commonly considered protective and anti-inflammatory. Although we noted a trend toward L. iners-dominant communities being found more commonly in women with late sPTB between 34 and 36 weeks compared to G. vaginalis-rich communities in women with earlier sPTB < 34 weeks, additional studies with larger sample sizes are needed to confirm whether vaginal microbiome composition differentially predicts severity of prematurity. Furthermore, clear geographic variations in common vaginal microbial characteristics and the associated risk of preterm birth highlight the need for population-specific approaches to classifying the vaginal microbiome and to identifying women at highest risk who will most benefit from preventive therapies 5,7,38-41. In our population, relative abundance of metagenomic subspecies of Gardnerella may better convey risk or protection than uncommon Lactobacillus species or community state type classifications. Further examination of the functional make-up of L. crispatus in this cohort and others where it was found protective is warranted. We acknowledge several limitations to this analysis.
Because of the nature of our observational cohort that relied on standard antenatal care practices, we could not investigate the role of sexually transmitted infections other than HIV and syphilis and we were limited by some missingness in baseline covariates. Additionally, we did not collect data on recent antibiotic use, which could differ by HIV serostatus but would likely bias our findings toward no association; nonetheless, recent antibiotic use was conceivably uncommon in the baseline cohort of samples collected at the first antenatal visit. Due to our sample selection procedure, the characteristics of this nested study are not directly generalizable to the full cohort. Although we weighted analyses to account for sample selection using inverse probabilities, we are currently undertaking more comprehensive analyses in a much larger sample of our cohort population to better characterize the interactions between HIV and ART, the vaginal microbiome and inflammation, and adverse birth outcomes. Similarly, due to limited funding, the current analyses were unable to investigate longitudinal differences between participants with and without HIV. The use of metagenomic sequencing is a strength in this study and afforded a higher resolution of Gardnerella which 16S rRNA gene sequencing cannot provide.

In summary, pregnant women in Zambia have high prevalence of anaerobe-rich vaginal microbiota correlated with local inflammation, and women with HIV exhibit characteristics of the vaginal microbiome associated with spontaneous preterm birth. These findings suggest the risk of preterm birth faced by women with HIV may be mediated by the vaginal microbial and inflammatory environment and could be a target for novel preventive therapies aimed at restoring a protective vaginal microenvironment. However, since many women with diverse vaginal microbiota and inflammation still deliver at term and certain species and subspecies differentially confer risk across populations, additional research is needed to identify women who would most benefit from intervention, to define how risk is modified by other host factors, and to tailor interventions to population and individual risk profiles.

Figure 6. Prevalence of metagenomic clusters (mgClust) at 16-20 gestational weeks among participants with term birth, spontaneous preterm birth at 34-36 weeks (sPTB 34-36), and spontaneous preterm birth before 34 weeks (sPTB < 34). Relative percents weighted for sampling and p values calculated by weighted Poisson regression of prevalence of mgClust between preterm birth outcomes compared to term; * p < 0.05; ** p < 0.001.

Table 6. Shannon diversity index and vaginal inflammatory scores at baseline (16-24 weeks) and the change (Δ) from baseline to repeat (24-36 gestational weeks) between participants experiencing term birth and spontaneous preterm birth (sPTB). Multivariable model estimates of the prevalence of spontaneous preterm birth (APR) calculated by weighted Poisson regression with robust error variance and adjusted for HIV serostatus (in baseline models) or detectable viral load (in change models), maternal age, body mass index, parity, and prior PTB.